Patent abstract:
Various embodiments include a device and methods for synchronizing virtualized data, or virtualized data subsets, across a plurality of data repositories. Synchronization can be performed in a data virtualization platform separate from the plurality of physical data repositories without requiring direct access to the plurality of physical data repositories. Additional devices, systems and methods are also disclosed.
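Publication number: FR3031604A1
Application number: FR1561227
Filing date: 2015-11-23
Publication date: 2016-07-15
Inventors: Ramesh Kumar Raghunathan; Ralph Lynn Nichols; Keshava Prasad Rangarajan; Chandra Yeleshwarapu
Applicant: Landmark Graphics Corp;
Main IPC classification:
Patent description: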

TECHNICAL FIELD

[0001] The present invention generally relates to apparatus and methods related to data synchronization.

Background

The term data virtualization describes a data management approach that may include data access and data manipulation without knowledge of all the particulars of the data, for example, how they are formatted and where they are physically located. Data virtualization approaches are currently geared toward capabilities that abstract the technical aspects of stored data to create a single logical data access point for connecting to different data sources and for translating source data for a user entity, among others. These technical aspects may include location, storage structure, and storage technology, among other physical parameters.

[0003] Methods and systems for data replication and synchronization are common for commercial and open source database repositories. Considerable effort has gone into approaches to data synchronization. However, the standard approaches offer no direct method that lets user entities operate in a virtualized data environment without interacting with the repositories directly. Several approaches, especially those reserved for commercial database offerings, depend on change-based transaction replay for data synchronization, in which all of the ordered source transactions are applied, in order, to each of the destination systems. In large networks of repositories under active synchronization, such replay mechanisms needlessly duplicate redundant change transactions, with negative consequences for performance and latency. Existing approaches also typically use specialized dialects, specific to each type of repository, and may not be suitable for working with semi-structured, unstructured, custom, and ad hoc data repositories.
The use of such repositories can be facilitated by exposing them through data virtualization platforms; however, common data virtualization platforms offer little or no comprehensive data synchronization support.

Brief Description of the Figures

[0004] Figure 1 is a flowchart of an exemplary system architecture, according to various embodiments.
Figures 2A-2K are flowcharts of exemplary system interfaces that may be implemented in the system architecture of Figure 1, according to various embodiments.
Figure 3 is a flowchart of an exemplary configuration model, according to various embodiments.
Figures 4A and 4B are flowcharts of an exemplary data synchronization flow, according to various embodiments.
Figure 5 is a flowchart of the characteristics of an exemplary central data model, according to various embodiments.
Figure 6 is a flowchart of an exemplary data synchronization method, according to various embodiments.
Figure 7 is a flowchart of an exemplary system that may be implemented in the exemplary system architecture of Figure 1, according to various embodiments.
DETAILED DESCRIPTION

[0011] The following detailed description refers to the accompanying drawings, which show, by way of illustration and not limitation, various embodiments that may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice these and other embodiments. Other embodiments may be used, and structural, logical, and electrical modifications may be made to these embodiments. The various embodiments are not necessarily mutually exclusive, since some embodiments may be combined with one or more other embodiments to create new embodiments. The following detailed description should therefore not be taken in a limiting sense.

When managing the data of different systems, an important consideration may be the synchronization of data across these systems. In other words, as the data changes in one system, the same changes or states must be reflected verbatim in another system. For a data repository that contains certain entities and attributes for these entities, one task would be to synchronize that state with another repository. Such synchronization may involve data conflicts between different entities.

[0013] A problem in conflict detection that is largely unresolved in several existing approaches involves conflict detection across a graph or hierarchy of object instances, in which the entire collection is synchronized collectively when a conflict is detected at any level of the collection. Such detection can be extremely difficult to perform with current methods, which treat each object instance atomically for synchronization and impose an order before synchronization only to handle repository constraints such as, without limitation, foreign keys.
In addition, several approaches generally provide limited support for specifying complex data subsets that constrain the set of objects (and the subset of their attributes) that must be synchronized from a source to a destination. In particular, in a data virtualization environment, the specification of a data subset can span multiple repositories and involve very complex queries that might be difficult to satisfy with current methods. Another complexity arises when these subset queries also vary dynamically over time based on the information in the multiple repositories.

In various embodiments, a data virtualization layer may be structured to address the aforementioned problems. In various embodiments, a data virtualization platform can be structured as a data virtualization layer so that access to repository objects can be obtained when direct connectivity to the repository objects is not possible. The data virtualization platform can be implemented to operate on objects that are exposed through views that may significantly transform the content of the original repository. The data virtualization platform may be implemented to operate on objects whose definitions differ between the source and destination repositories. The data virtualization platform may be implemented to operate on objects composed of attributes derived simultaneously from multiple heterogeneous repositories, for example, a relational database, a spreadsheet, and an XML (extensible markup language) web service. The data virtualization platform can be structured as above without interacting directly with the repositories to execute procedures such as stored procedures and triggers.
The data virtualization platform can be structured, unlike existing methods and systems, to function without assuming that the source and destination entity and attribute definitions in a synchronization procedure are identical, or that each object is synchronized in its entirety with all of its associated attributes. In addition, the data virtualization platform can be structured, unlike existing standard methods and systems, to operate without exchanging synchronization metadata between the synchronized repositories, eliminating the need to create data stores in those repositories for such metadata.

In various embodiments, a method, a configuration mechanism, and an execution framework are described such that any repository can be synchronized when operating in a virtualized data environment. Embodiments of a data virtualization layer, which abstracts away the connection details and other details of how a user instrument communicates directly with a data repository, can be structured to operate in a virtual data arena that includes the synchronization of multiple repositories. For example, a SQL (structured query language) server database, an Oracle database, an Excel file, a web service, or another electronic data container can be located in the virtual data environment, and all can be treated identically. In addition, a data virtualization platform can be structured so that no metadata relating to a synchronization (process, mechanism) has to be stored in any of the synchronized repositories or transferred from one synchronized repository to another. Such a data virtualization platform does not have to physically modify any of the repositories that are synchronized. In various embodiments, data synchronization may result in no additional data being added to the repositories beyond the entities that need to be synchronized.
Two aspects of such an approach may include, first, leaving each data repository unchanged and, second, moving nothing from one repository to another other than the data of the entities being synchronized. These entities, and the attributes of the entities that are synchronized, need not be identical. Synchronization can include synchronizing parts of the data. For example, data held in a repository that is structured with three decimal places can be synchronized with data held in another repository that is structured with five decimal places, resulting in data having three decimal places. A data virtualization layer, implemented by a data virtualization platform, does not store anything persistently; it optionally pushes the synchronized data to the data repository of interest during synchronization.

In one embodiment, during a synchronization procedure, only the last state of a repository is synchronized to another repository, so that redundant or unnecessary change transactions are not replayed inefficiently. In the synchronization process, changes are made in a destination repository to bring it into line with the data in a source repository, at least to the extent that the data corresponds to the attributes of an entity in the destination repository. The terms "source" and "destination" refer to the initiation of a synchronization: in one procedure, one repository is the source and another is the destination, and in another procedure the roles of the two repositories are reversed.

Before any change is applied, a determination can be made as to whether the change is warranted. The detection process may include a comparison. The comparison can be performed recursively. The detection mechanism can perform a three-parameter match.
The match compares the value in the source repository, the value in the destination repository, and a previous value that was either synchronized or moved from one repository to the other. Based on this three-parameter comparison, all the different combinations can be determined, which identifies how to synchronize. The detection mechanism may overlay these three determinations, recursively, to determine what must actually change on the target. Where there has been a change, a configuration in the data virtualization platform can be consulted to determine whether the changes should overwrite the data or should simply be ignored. When changes are ignored, no action is taken. If changes are to be applied, the comparison mechanism is executed against the incoming data, including the unmodified portions, the existing data, and the previous value. The comparison can, again, be performed recursively.

[0019] Changes may also be made to a hierarchy. The source and destination entities are processed atomically. In other words, a change at any point in the hierarchy can be treated as an atomic change to the whole hierarchy.
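The recursive three-parameter comparison described above can be illustrated with a minimal Python sketch. It assumes entities are represented as nested dictionaries of attribute values; the function name `three_way`, the `MISSING` sentinel, and the result labels are hypothetical and do not come from the patent text.

```python
from typing import Any

MISSING = object()  # hypothetical sentinel meaning "attribute absent on this side"


def three_way(source: Any, dest: Any, base: Any) -> Any:
    """Classify a value, or recursively a nested dict of values, by comparing
    the source value, the destination value, and the last synchronized value."""
    if all(isinstance(v, dict) for v in (source, dest, base)):
        keys = set(source) | set(dest) | set(base)
        return {
            k: three_way(source.get(k, MISSING),
                         dest.get(k, MISSING),
                         base.get(k, MISSING))
            for k in sorted(keys)
        }
    if source == dest:
        return "in_sync"    # destination already holds the source content
    if source == base:
        return "dest_only"  # only the destination changed since the last sync
    if dest == base:
        return "apply"      # only the source changed: safe to apply
    return "conflict"       # both sides changed, to different values


result = three_way(
    {"name": "W1", "loc": {"x": 1.0, "y": 2.0}},      # source
    {"name": "W1-old", "loc": {"x": 1.5, "y": 2.0}},  # destination
    {"name": "W1-old", "loc": {"x": 1.2, "y": 2.0}},  # last synchronized value
)
```

In this sketch a "conflict" result would then be handed to whatever overwrite-or-ignore policy the configuration specifies.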
The entire hierarchy is synchronized, rather than a single entity or a single attribute. In various embodiments, synchronization in a data virtualization layer may take into account the hierarchical relationships between the entities. Such relationships can be used to further enhance conflict detection for data from a plurality of sources. Hierarchical grouping can be used in a virtualized database environment to synchronize composite data types in which changes to one or more entities that share a single root ancestor can, by convention, be applied from one source or the other, but not from both.

An embodiment of hierarchical grouping can begin with the introduction of a hierarchical configuration, stored in a file or database, that indicates which related types are hierarchical and the relationships that link the hierarchy. For example, a configuration may indicate that the type D entity is a child of the type A entity by a specific foreign key, and that the type C entity is a child of the type B entity by a specific foreign key, which, in turn, is a child of the type A entity by a specific foreign key. The types A, B, C, and D are then said to form a hierarchy, the hierarchy [C => B => A, D => A].

Next, the term "hierarchical grouping," or "grouping," can be introduced. A hierarchy indicates the types of a hierarchy, while a grouping represents the actual entities of the hierarchy. Entities in a grouping are relationally related by the foreign keys that link the hierarchy; that is, their foreign keys described in the configuration match the relational keys of the related parent entities of the types in the hierarchy. For example, consider the case in which an entity "a" of type A is related to an entity "d" of type D by a specific foreign key described in the configuration, and no other entity is related to "d" through this specific foreign key. Then ['a', 'd'] forms a complete grouping under the hierarchy [C => B => A, D => A].
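As an illustration of the hierarchy and grouping terminology, the following Python sketch encodes the hierarchy [C => B => A, D => A] as a configuration and places entities into groupings by following the configured foreign keys up to their root ancestor. The dictionary layout, the column names `a_id` and `b_id`, and the function names are assumptions made for the example, not part of the described system; the sketch also assumes every referenced parent is present in the input.

```python
from collections import defaultdict

# Hypothetical configuration: child type -> (parent type, foreign key column).
HIERARCHY = {"B": ("A", "a_id"), "C": ("B", "b_id"), "D": ("A", "a_id")}


def root_of(entity, by_key):
    """Follow the configured foreign keys up to the root ancestor."""
    while entity["type"] in HIERARCHY:
        parent_type, fk = HIERARCHY[entity["type"]]
        entity = by_key[(parent_type, entity[fk])]
    return (entity["type"], entity["id"])


def build_groupings(entities):
    """Group entities that share a single root ancestor."""
    by_key = {(e["type"], e["id"]): e for e in entities}
    groupings = defaultdict(list)
    for e in entities:
        groupings[root_of(e, by_key)].append(e["name"])
    return dict(groupings)


entities = [
    {"type": "A", "id": 1, "name": "a"},
    {"type": "D", "id": 7, "a_id": 1, "name": "d"},
    {"type": "A", "id": 2, "name": "a2"},
]
groupings = build_groupings(entities)
```

Here the entities "a" and "d" end up in one grouping rooted at the type A entity with id 1, matching the ['a', 'd'] example in the text.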
Embodiments of hierarchical grouping may be described using the aforementioned terminology. Some embodiments can be achieved by transforming a change log once when it is extracted from its source, and again when it is applied to the target. When extracting the change log, the order of the log can be remembered by assigning an integer value to each entity, representing the order in which the entities were encountered in the change log. Then, the hierarchical entities found in the log can be put into their respective groupings by comparing the entities against their prospective parent entities to determine whether the foreign key matches, as described in the configuration. The grouping is then compared with the contents of the source database by performing queries, using the foreign keys of the configuration, for each of the entities in the grouping. If further entities are found to be related to the entities of the grouping, they are added to the grouping as change log entries in which no entity attribute has changed, and each is assigned the next available integer value for the log order. The comparison can continue with the newly added entities until no other entity can be found. Once the comparison procedure is complete, all of the entities, both those now within groupings and those that are not relational, can be put into an array and sorted by their assigned order value, thus completing the extraction transformation.

[0023] A final part of the hierarchical transformation can occur at log application time. When attempting to apply a change log, integer values can once again be assigned to the change log entries to indicate their order, and the hierarchical entities are placed into groupings as described for the source extraction stage. A global change queue can be created and made ready for new entries. Non-hierarchical entities can be added to the global change queue.
At this point, each source grouping can be compared with a target grouping containing the contents of the target virtual database. The target grouping can be created by taking a copy of the grouping root and performing the grouping creation steps described for the source extraction. The comparison of the target and source groupings can be performed by overlaying the trees formed by the groupings, matching entities by their primary keys, and then adding the changes to a local queue. When entities are found to match by primary key, they can be compared by their remaining attributes; if they differ, a change log update entry can be added to the local queue and assigned the order value of the source entity, and the grouping can be noted as being in conflict. When entities are found to exist in the source grouping but not in the target grouping, a change log insert entry can be added to the local queue and assigned the next integer order value, and the grouping can be noted as being in conflict. When entities are found to exist in the target grouping but not in the source grouping, a change log delete entry can be added to the local queue and assigned the next integer order value, and the grouping can be noted as being in conflict.

A conflict policy can then be consulted. If the conflict policy indicates that incoming conflicts should not be applied, the local queue may be discarded. If the conflict policy indicates that incoming conflicts should still be applied, the contents of the local queue can be added to the global change queue. At this point, the changes have been collected. As a final step, the global change queue can be sorted by assigned order value.
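The apply-time comparison of a source grouping with a target grouping can be sketched as follows. This is a minimal illustration under stated assumptions: each grouping is a dictionary keyed by primary key, change entries are plain dictionaries, and the names `diff_grouping`, `apply_policy`, and `apply_incoming` are hypothetical.

```python
def diff_grouping(source, target, next_order):
    """Compare a source grouping with a target grouping.

    Both are {primary_key: {"order": int, "attrs": dict}}; the target side
    only needs "attrs". Returns (local_queue, conflict, next_order).
    """
    local_queue, conflict = [], False
    for pk, src in source.items():
        if pk not in target:
            # Exists only on the source side: pending insert, next order value.
            local_queue.append({"op": "insert", "pk": pk, "order": next_order})
            next_order += 1
            conflict = True
        elif src["attrs"] != target[pk]["attrs"]:
            # Same primary key, different attributes: pending update that
            # keeps the order value of the source entity.
            local_queue.append({"op": "update", "pk": pk, "order": src["order"]})
            conflict = True
    for pk in target:
        if pk not in source:
            # Exists only on the target side: pending delete, next order value.
            local_queue.append({"op": "delete", "pk": pk, "order": next_order})
            next_order += 1
            conflict = True
    return local_queue, conflict, next_order


def apply_policy(global_queue, local_queue, conflict, apply_incoming):
    """Merge or discard the local queue per the conflict policy, then sort."""
    if not conflict or apply_incoming:
        global_queue = global_queue + local_queue
    return sorted(global_queue, key=lambda change: change["order"])


source = {
    1: {"order": 0, "attrs": {"n": "a"}},
    2: {"order": 1, "attrs": {"n": "b2"}},
    3: {"order": 2, "attrs": {"n": "c"}},
}
target = {1: {"attrs": {"n": "a"}}, 2: {"attrs": {"n": "b"}}, 4: {"attrs": {"n": "z"}}}
local, in_conflict, following = diff_grouping(source, target, next_order=3)
non_hierarchical = [{"op": "insert", "pk": 9, "order": 2}]
merged = apply_policy(non_hierarchical, local, in_conflict, apply_incoming=True)
rejected = apply_policy(non_hierarchical, local, in_conflict, apply_incoming=False)
```

Sorting the global queue only at the end keeps the hierarchical and non-hierarchical entries interleaved in their original log order.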
When the hierarchical transformation is complete, the contents of the global change queue can be passed on to the remainder of the synchronization process as the change log to be applied.

In various embodiments, the entities of the source and destination repositories may be operably linked, which may be referred to as ID (identification) matching. In a relational structure, a primary key may be used in such an ID match. Consider two user instruments with data relating to the same object, using the same name. In one repository, the object is associated with a primary key having a value of N, and in the other repository the object is associated with a primary key having a value of M. In the data virtualization platform, the primary key can be structured as an integer ID: a unique number that is assigned when each row is created in the repository. When a change is made to the object with a primary key value of N, the change request is sent with the name of the object, as specified in a configuration file. The name is a natural key.
Before the change is applied in the other repository, the incoming change is examined and it is determined that the incoming change is associated with the name of the object in that repository, which has a primary key value of M. The change set that is applied can be based on a conflict policy, which can be set with respect to the primary key. In various embodiments, a primary key may comprise several parts, particularly with increased nesting in the relational structure.

In various embodiments, changes to the virtual data in a virtual environment are tracked at the column level. Therefore, a given repository can synchronize one set of its attributes to a second repository and can simultaneously synchronize a completely different set to a third repository. This approach can provide complete flexibility in terms of the portions of the data that can be moved between different repositories.

Key mapping can be performed in the presence of additional unique constraints. When synchronizing database changes in a virtual database environment in which entities have primary keys and additional unique constraints, and in which the unique constraint determines the identity of the entity in preference to the primary key, a particular type of conflict can be encountered. Two instances of the same entity, considered to be the same by comparing the additional unique constraint, may have been added on both sides of the virtual database environment, so that the entity cannot be added from the source side to the target side without violating the unique constraint. This produces the so-called "create-create" conflict. Embodiments, as described herein, can be used to resolve these conflicts automatically by attaching uniqueness information to the entity's change log entry, so that pending changes to a target can be rewritten prior to their application by changing the primary key in the change log to match the existing key on the target side.
The method of rewriting the change log can begin by adding the uniqueness information to a change log entry when the entity change is recorded on the source side. When the change is saved, the primary key can be saved in the change log. The method can add the uniqueness information by recording a tuple of values, one for each of the columns of the unique constraint. In addition, if one of the columns of the unique constraint is a foreign key to another entity, the related key in the related entity is determined to be the primary key of that entity, and the related entity has an additional unique constraint, then the tuple of values from the related entity can be substituted for the foreign key entry. The process of substituting tuples for foreign keys may continue on the substituted tuples until no further substitution can be made.

[0030] The rewrite process may conclude by updating the primary keys in the change log before an attempt is made to apply the change on the target side. Rewriting can be accomplished by retrieving the uniqueness information from the entity's change log entry and then attempting to obtain the equivalent primary key on the target side that matches the uniqueness information. The target-side entity that corresponds to the uniqueness tuple can be obtained by a query against the virtual database; the values of the unique constraint columns of the target entity must match the equivalent columns in the uniqueness tuple. If an equivalent column is found to be itself a tuple rather than a single value, that column is a foreign key. In that case, the primary key of the foreign entity whose unique constraint columns match the nested tuple is obtained by the same method and substituted for the nested tuple in the uniqueness tuple.
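The target-side key lookup described so far can be sketched in Python. The sketch assumes the target contents are available as in-memory rows and that each unique constraint is described by a list of column descriptors; the entity types `well` and `field`, the `references` field, and the function names are hypothetical illustrations, and a real platform would issue queries against the virtual database instead of scanning lists.

```python
def resolve_primary_key(entity_type, uniqueness, target_rows, unique_columns):
    """Return the target-side primary key whose unique constraint columns
    match the uniqueness tuple, or None if there is no such entity.

    Nested tuples stand for foreign keys and are resolved first, recursively.
    """
    columns = unique_columns[entity_type]
    resolved = []
    for column, value in zip(columns, uniqueness):
        if isinstance(value, tuple):
            # Foreign key carried as the related entity's uniqueness tuple.
            value = resolve_primary_key(column["references"], value,
                                        target_rows, unique_columns)
        resolved.append(value)
    for row in target_rows[entity_type]:
        if [row[c["name"]] for c in columns] == resolved:
            return row["id"]
    return None


def rewrite_change(change, target_rows, unique_columns):
    """Replace the source primary key with the equivalent target key, if any."""
    pk = resolve_primary_key(change["type"], change["uniqueness"],
                             target_rows, unique_columns)
    return {**change, "pk": pk} if pk is not None else change


unique_columns = {
    "field": [{"name": "name"}],
    "well": [{"name": "name"}, {"name": "field_id", "references": "field"}],
}
target_rows = {
    "field": [{"id": 50, "name": "North"}],
    "well": [{"id": 900, "name": "W-1", "field_id": 50}],
}
change = {"type": "well", "pk": 17, "uniqueness": ("W-1", ("North",))}
rewritten = rewrite_change(change, target_rows, unique_columns)
```

When no equivalent entity exists on the target side, the change is left untouched in this sketch and would be applied as an ordinary insert.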
Since tuples can contain tuples, the process can recursively resolve tuples into primary keys until the primary key that corresponds to the entire uniqueness tuple is finally obtained. When the final primary key is obtained, it can replace the primary key in the change log. The rewrite is complete at this point: since the primary key and the unique constraint columns now match in the change log, the conflict generated by having primary keys with additional unique constraints has been resolved.

Figure 1 is a flowchart of an embodiment of an exemplary system architecture 10. The system architecture 10 may include a data virtualization platform 101 managing data flows from the user instruments 100 to the storage 102, so that the user instruments 100 do not interact directly with the storage 102 or the components of the storage 102 but instead go through the data virtualization platform 101. The user instruments 100 may contain no information regarding the location of, or routing to, the storage 102 or the components of the storage 102. User instruments 100 may include, without limitation, mobile devices, applications, service instrumentalities, and systems. The user instruments 100 essentially "see" a presentation of the source, or a view of the source, in the storage 102; beneath the user instruments 100, the data virtualization platform 101 takes care of the translation between this view and the actual physical data stored in the storage 102.

The data virtualization platform 101 may comprise a destination data server 103, a source data server 104, and a synchronization data server 105. The destination data server 103 may comprise a destination data virtualization model 103-1. The source data server 104 may comprise a source data virtualization model 104-1. The synchronization data server 105 may comprise a synchronization data virtualization model 105-1.
The storage 102 may comprise a destination repository 109, a source repository 110, a source repository 111, a synchronization repository 112, and a source repository 113. These repositories may be embodied as separate physical components, each of which may be remote from the others. The destination repository 109 may be coupled to the destination data server 103 via a communication path 114 to allow bidirectional communication of data with the destination data virtualization model 103-1. The source repository 110 and the source repository 111 may be coupled to the source data server 104 via communication paths 115 and 116, respectively, to allow bidirectional communication of data with the source data virtualization model 104-1. The synchronization repository 112 may be coupled to the synchronization data server 105 via the communication path 114 to allow bidirectional communication of data with the synchronization data virtualization model 105-1. The synchronization repository 112 can store all the metadata and state relating to what has already been synchronized between two repositories, and can store what remains to be done. The source repository 113 may be coupled to the user instruments 100 via a communication path 121 to allow bidirectional communication of data with the user instruments 100. The source repository 113 may be structured as a local database of the user instruments 100.

The data virtualization platform 101 may be structured to synchronize all the virtualized data, or subsets of the data, if any, across heterogeneous data repositories, whatever the source or origin of the data, such as commercial databases, files, data inside spreadsheets, web services, mobile devices, big data repositories, cloud repositories, NoSQL repositories, or any other type of virtualized data repository, without the need for direct access to these repositories during synchronization.
The data virtualization platform 101 may be structured to perform one or more of the following tasks: read configuration information about sources, destinations, and data mappings; update a subset of source data for a receiving destination; check the source for new changes since the last such check; identify pending changes for the destination since the last such synchronization; check the pending changes for conflicts; apply an appropriate conflict resolution policy; put the entities in a chosen execution order before synchronizing the data; apply the pending insertions first; apply the updates after the pending insertions; apply the deletions after the updates; track and document any errors encountered during these operations; and record a transaction summary of the entire synchronization process.

The data virtualization platform 101, or a similar data virtualization platform, may be structured to periodically invoke procedures that enable various data repositories to incrementally obtain identical data content across any connected network of repositories in any deployment configuration. Such a deployment configuration may include, without limitation, peer-to-peer, star, master-slave, and other configurations. The data virtualization platform 101, or a similar data virtualization platform, may include a scheduler, a timer, an executable task, and procedures that use these components. The data virtualization platform 101, or a similar data virtualization platform, may be structured so that it can be configured to specify the mapping of data between the source and destination repositories.
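The apply order in the task list above (insertions, then updates, then deletions, with entities in a chosen execution order) can be expressed as a sort key. The following Python sketch is illustrative only; the change-entry layout, the `entity_rank` mapping, and the entity names are assumptions.

```python
# Insertions are applied first, then updates, then deletions.
OP_RANK = {"insert": 0, "update": 1, "delete": 2}


def order_pending(changes, entity_rank):
    """Sort pending changes by operation, then by configured entity order."""
    return sorted(
        changes,
        key=lambda c: (OP_RANK[c["op"]], entity_rank[c["entity"]]),
    )


pending = [
    {"op": "delete", "entity": "well"},
    {"op": "update", "entity": "field"},
    {"op": "insert", "entity": "well"},
    {"op": "insert", "entity": "field"},
]
# Hypothetical execution order: parent entity types before child entity types.
ordered = order_pending(pending, entity_rank={"field": 0, "well": 1})
```

Whether deletions should use the reverse entity order, for example to satisfy foreign keys, is not stated in the text and is left out of this sketch.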
The data virtualization platform 101, or a similar data virtualization platform, may include one or more of the following: a configuration schema definition that constrains the validity of the configuration information; connection information for the virtualized sources and destinations required by the platform; parameters, such as a timing interval or frequency, that govern the execution of the platform; and a mapping between the source and destination entities and their attributes for carrying out the synchronization methods of the platform. A data model and/or schema for storing data and metadata can be associated with the data virtualization platform 101, or a similar data virtualization platform, and with the methods of using such a platform described herein.
The model may include entities and relationships that track one or more of the following: metadata, including an incremental change tracking counter, associated with the changed attributes of all the entities of all the repositories configured by the data virtualization platform 101 or a similar data virtualization platform; the metadata associated with a subset of data from a source repository to a destination repository, as configured by the platform; the metadata associated with the information gathered during previous synchronization cycles between source and destination repositories; and any errors associated with the propagation of the actual change associated with a given metadata change.

A data model and/or schema can be structured to store synchronization transaction information. The synchronization transaction information may include the date and time at which the synchronization activity concluded, the unique source identifier, the unique destination identifier, the source entities, the destination entities, the source attributes, the destination attributes, the count of synchronized entities, the count of synchronized attributes, the count of entities with errors during synchronization, the count of attributes with errors during synchronization, and the start and end values of the metadata counter.

The data virtualization platform 101, or a similar data virtualization platform, may be structured to check for conflicts in the pending changes.
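One way to picture the incremental change tracking counter is the following Python sketch, in which every changed attribute is stamped with a monotonically increasing counter and each destination remembers the last counter value it has synchronized. The class and method names are hypothetical, and the in-memory dictionaries stand in for whatever schema the synchronization repository actually uses.

```python
class ChangeTracker:
    """Attribute-level change metadata with an incremental counter."""

    def __init__(self):
        self.counter = 0
        self.changes = {}     # (entity, key, attribute) -> counter at last change
        self.watermarks = {}  # destination id -> last counter value synchronized

    def record(self, entity, key, attribute):
        """Stamp a changed attribute; only the latest stamp is kept."""
        self.counter += 1
        self.changes[(entity, key, attribute)] = self.counter

    def pending(self, destination, wanted_attributes):
        """Changes newer than the destination's watermark, limited to the
        attributes that destination is interested in, in counter order."""
        since = self.watermarks.get(destination, 0)
        hits = [k for k, n in self.changes.items()
                if n > since and k[2] in wanted_attributes]
        return sorted(hits, key=self.changes.get)

    def mark_synced(self, destination):
        """Advance the watermark; return start and end counter values, as
        would be recorded in the transaction summary."""
        start = self.watermarks.get(destination, 0)
        self.watermarks[destination] = self.counter
        return start, self.counter


tracker = ChangeTracker()
tracker.record("well", 1, "name")
tracker.record("well", 1, "depth")
tracker.record("well", 1, "name")   # repeated change: only the last state counts
for_a = tracker.pending("destA", {"name"})
for_b = tracker.pending("destB", {"name", "depth"})
span = tracker.mark_synced("destA")
```

Because only the latest counter value per attribute is kept, a repeated change to the same attribute produces a single pending entry rather than a replay of every transaction.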
The data virtualization platform 101, or a similar data virtualization platform, may be structured to perform one or more of the following: perform a three-parameter match to detect attribute change conflicts by comparing a hash, or unique numeric code, of the source content, the hash, or unique numeric code, of the destination content, and the stored hash, or unique numeric code, of the last known synchronized content; take hierarchical relationships between entities into account in order to further improve conflict detection; and skip the pending change if it is detected that the destination already has the same content as the source change.

The data virtualization platform 101, or a similar data virtualization platform, may be structured to apply an appropriate conflict resolution policy that resolves the conflicts detected as summarized above with respect to checking the pending changes for conflicts: performing a three-parameter match, taking hierarchical relationships into account, and skipping a pending change. The data virtualization platform 101, or a similar data virtualization platform, may be structured to apply an appropriate conflict resolution policy by performing an operation that includes determining the winner in the case of a conflict, as indicated by the configuration presented above for specifying a data mapping between the source and destination repositories.
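The hash-based form of the three-parameter match can be sketched as below. The text does not say which hash or serialization is used; SHA-256 over canonical JSON is an assumption chosen for the example, as are the function names and the result labels.

```python
import hashlib
import json


def content_hash(attrs):
    """Digest of an entity's attributes, independent of key order."""
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify(source_attrs, dest_attrs, last_synced_hash):
    """Compare source, destination, and last-synchronized digests."""
    s, d = content_hash(source_attrs), content_hash(dest_attrs)
    if s == d:
        return "skip"              # destination already has the source content
    if d == last_synced_hash:
        return "apply"             # destination untouched since the last sync
    if s == last_synced_hash:
        return "no_source_change"  # only the destination changed
    return "conflict"              # both changed: consult the conflict policy


base = content_hash({"depth": 1200})
decisions = [
    classify({"depth": 1250}, {"depth": 1250}, base),
    classify({"depth": 1250}, {"depth": 1200}, base),
    classify({"depth": 1250}, {"depth": 1300}, base),
]
```

Storing only the digest of the last synchronized content means no intermediate copy of the actual data has to be kept for conflict detection.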
The configuration may include one or more of the following: a configuration schema definition that constrains the validity of the configuration information; connection information for the virtualized sources and destinations required by the data virtualization platform 101 or a similar data virtualization platform; parameters, such as a synchronization interval or frequency, that govern the execution of the data virtualization platform 101 or a similar data virtualization platform; and a mapping between the source and destination entities and their attributes used to implement the synchronization methods of the data virtualization platform 101 or a similar data virtualization platform.  The data virtualization platform 101 or a similar data virtualization platform may be structured to apply an appropriate conflict resolution policy that resolves the detected conflicts, including cancelling or applying the pending change as derived from the determined policy.  [0042] The data virtualization platform 101 or a similar data virtualization platform may be structured to work in association with the stored change metadata, as presented above with respect to a data model and/or schema for storing data and metadata.  
Such a combination may be used to provide one or more of the following: change metadata is tracked at the entity and attribute level, thereby allowing partial synchronization of an entity in the event that a destination is interested in only a subset of attributes and entities; the metadata change counter allows incremental synchronization of only the latest changes from a source repository to multiple simultaneous destinations, each of which may require disparate subsets of the source data; deletions are properly propagated even when source repositories do not retain, or expose, information about deleted data; redundant and spurious cycles of change-related updates are prevented between repositories that synchronize symmetrically; synchronization remains robust and error-free when entities and attributes are removed from or added to the source and destination repository schemas; the need to synchronize server system clocks across the network of synchronizing repositories is eliminated; the need to store intermediate copies of the actual change data in any repository is avoided; and an arbitrary query specification can be supplied dynamically at run time to control the subset of data synchronized between a source and a destination.  Figures 2A-2K are flowcharts of exemplary embodiments of system interfaces that may be implemented in the system architecture of Figure 1.  An interface may be provided by a module that provides executable procedures.  These interfaces can provide for the realization of the methods and systems described herein.  Figure 2A is a flowchart of an embodiment of an exemplary change interface 200.  
The change interface 200 may have a change counter 200-1 and may be arranged to contain a unique repository identifier 200-2, which may be structured as a primary key, a change operation 200-3, an entity name 200-4, an ID 200-9, an attribute name 200-5, a hash of the value of the changed attribute 200-6, a change state 200-7, and an optional error 200-8.  The change operation can be an insertion, an update, or a deletion.  The change counter can be structured to maintain a change version in order to keep track of the version stored in the metadata.  It can keep track of every detail.  This change counter may be stored in a synchronization metadata repository, which may be arranged to store metadata but which stores no actual value of any of the entities that are synchronized.  The change counter allows tracking across multiple synchronizations.  In addition, a hash of the actual value, where the hash is a signature of what the value represents, can be maintained.  A signature of a data item can be computed as a hash, which allows a comparison of hash values to determine whether or not there has been a change.  For example, a large file can be reduced to a single number, so that if the number differs from a previously stored hash, the comparison indicates that the entity has changed in some way.  There is no need to go through the entire file to find out whether it has changed.  A comparison of its hash indicates that there has been a change, which makes it possible to keep track of the version.  Figure 2B is a flowchart of an embodiment of an exemplary change collection interface 201.  
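As a hedged illustration of the change interface 200 and its hash-based change detection, a change record can be pictured as a small immutable structure whose hash field stands in for the actual value. The class name `Change`, the helper names `hash_value` and `has_changed`, the field types, and the use of SHA-256 are all assumptions of this sketch and are not taken from the description.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def hash_value(value: bytes) -> str:
    """Reduce an arbitrarily large value to a fixed-size signature."""
    return hashlib.sha256(value).hexdigest()


@dataclass(frozen=True)
class Change:
    """Hypothetical record mirroring the fields listed for change interface 200."""
    counter: int                 # change counter (200-1)
    repository_id: str           # unique repository identifier (200-2)
    operation: str               # "insert", "update" or "delete" (200-3)
    entity_name: str             # entity name (200-4)
    entity_id: str               # ID (200-9)
    attribute_name: str          # attribute name (200-5)
    attribute_hash: str          # hash of the changed attribute value (200-6)
    state: str = "pending"       # change state (200-7)
    error: Optional[str] = None  # optional error (200-8)


def has_changed(current_value: bytes, stored_hash: str) -> bool:
    """Detect a change by comparing signatures, without keeping the old value."""
    return hash_value(current_value) != stored_hash
```

Only the signature is kept in the record, which matches the statement that the synchronization metadata repository stores no actual entity values.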
The change collection interface 201 may include an instrumentality to add a change 201-1, iterate through collected changes 201-2, retrieve a specific change 201-3, check whether the collection contains a specific change 201-4, manage a list of entity keys for the change collection 201-5, check whether a given change conflicts with this change collection 201-6, and maintain an entity count 201-7 and an attribute count 201-8.  Figure 2C is a flowchart of an embodiment of an exemplary change source interface 202.  The change source interface 202 may include an instrumentality for retrieving the latest changes 202-1, the attributes of each entity exposed by the source for synchronization 202-2, the list of attribute data types for the entity attributes 202-3, the key attributes for each entity 202-4, and the key attribute types for each entity 202-5.  The change source interface 202 may be structured to delete an entity 202-6, insert an entity 202-7, update an entity 202-8, and specify a mapping 202-9 to a configured destination entity, as described herein.  Figure 2D is a flowchart of an embodiment of an exemplary synchronizer interface 203.  The synchronizer interface 203 may include an instrumentality for retrieving new source changes 203-1, defining subsets of data from a source to a destination 203-2, synchronizing two repositories 203-3, resetting the change tracking metadata 203-4, providing the errors encountered 203-5, and documenting and reporting the transactions 203-6.  
The synchronization of the two repositories may include performing one or more of the following: reading configuration information about sources, destinations, and data mappings; updating the subset of source data for the destination repository; checking the source for new changes since the last such check; identifying pending changes for the destination since the last such synchronization; checking the pending changes for conflicts; applying an appropriate conflict resolution policy; placing the entities in a chosen execution order before synchronizing the data; applying the pending insertions first, followed by the updates and then the deletions; tracking and documenting any errors encountered during these operations; and recording a transaction summary of the entire synchronization process.  Figure 2E is a flowchart of an embodiment of an exemplary synchronization specification interface 204.  The synchronization specification interface 204 may include a source repository 204-1, a destination repository 204-2, a repository for storing the synchronization metadata 204-3, and a mapping between the source entities and the destination entities 204-4.  Figure 2F is a flowchart of an embodiment of an exemplary synchronization map interface 205.  The synchronization map interface 205 may comprise a source entity list 205-1, a query that, when executed on the source repository, specifies the target subset of data 205-2 for the destination repository 205-3, and a set of attribute mappings from the source entity to the destination entity 205-4.  Figure 2G is a flowchart of an embodiment of an exemplary synchronization transaction interface 206.  
The synchronization transaction interface 206 may provide attributes for storing synchronization transaction information, including the date and time of the conclusion of the synchronization activity 206-1, a unique source identifier 206-2, a unique destination identifier 206-3, source entities 206-4, destination entities 206-5, source attributes 206-6, destination attributes 206-7, and start 206-8 and end 206-9 values of a metadata counter.  Figure 2H is a flowchart of an embodiment of an exemplary synchronization state interface 207.  The synchronization state interface 207 may provide the status of a current synchronization operation via states that cycle between the tags success 207-1, pending 207-2, error 207-3, manual 207-4, skipped 207-5, and in source 207-6.  Figure 2I is a flowchart of an embodiment of an exemplary change hash interface 208.  The change hash interface 208 may include an instrumentality to provide a unique hash or numeric code for any attribute value 208-1.  The change hash interface 208 may include an algorithm used to calculate this hash value 208-2, and various data structures to hold groups of such hashes, such as collections 208-3, maps 208-4, and trees 208-5.  Figure 2J is a flowchart of an embodiment of an exemplary synchronization operation interface 209.  The synchronization operation interface 209 may describe modes and kinds of changes such as insertions 209-1, updates 209-2, deletions 209-3, and no change 209-4.  Figure 2K is a flowchart of an embodiment of an exemplary synchronization exception interface 210.  The synchronization exception interface 210 may issue a synchronization error message 210-1 and any context associated with this error message 210-2.  Figure 3 is a flowchart of an embodiment of an exemplary configuration template for a configuration set 300.  The configuration set 300 may include parameters 301 and a specification 302.  
The parameters 301 may include, without limitation, an accuracy parameter 303, a rounding parameter 304, and an interval parameter 305.  The interval parameter 305 can specify how frequently synchronization is to be performed.  The specification 302 includes the configuration data of a map 306, a source 307, a destination 308, and a synchronization repository 309.  The synchronization repository 309 may include connection information 320.  The map 306 may comprise configuration data of a source entity 310, a destination entity 311, an attribute map 312, and a subset query 313.  The attribute map 312 may include configuration data for a source attribute 321 and a destination attribute 322.  The configuration data for the source 307 may include an ID 314, a conflict policy 315, and connection information 316.
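To make the shape of the configuration set 300 easier to picture, the following Python sketch mirrors its parts as nested data classes and adds a tiny validity check. All class names, field types, and validation rules here are assumptions introduced for illustration; the description defines only the reference numerals and does not prescribe a concrete schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttributeMap:                  # attribute map 312
    source_attribute: str            # 321
    destination_attribute: str       # 322


@dataclass
class EntityMap:                     # map 306
    source_entity: str               # 310
    destination_entity: str          # 311
    attribute_maps: List[AttributeMap]
    subset_query: str = ""           # subset query 313 (free-form text here)


@dataclass
class Endpoint:                      # source 307 or destination 308
    id: str
    conflict_policy: str             # e.g. the identity of the winner
    connection: str                  # connection information


@dataclass
class Parameters:                    # parameters 301
    accuracy: int                    # 303
    rounding: str                    # 304
    interval_seconds: int            # 305


@dataclass
class ConfigurationSet:              # configuration set 300
    parameters: Parameters
    source: Endpoint
    destination: Endpoint
    sync_repository_connection: str  # connection information 320
    maps: List[EntityMap] = field(default_factory=list)


def validate(cfg: ConfigurationSet) -> List[str]:
    """Return a list of problems; an empty list means the sketch accepts cfg."""
    problems = []
    if cfg.parameters.interval_seconds <= 0:
        problems.append("interval must be positive")
    if cfg.source.id == cfg.destination.id:
        problems.append("source and destination must differ")
    for m in cfg.maps:
        if not m.attribute_maps:
            problems.append(f"map {m.source_entity} has no attribute mappings")
    return problems
```

The `validate` function plays the role that a configuration schema definition would play in constraining the validity of the configuration information, in a much reduced form.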
[0006] The conflict policy 315 can be embodied in a number of ways. The conflict policy 315 may be the identity of the winner of a conflict. The conflict policy 315 may be a set of rules for determining the winning entity of the conflict. The configuration data for the destination 308 may include an ID 317, a conflict policy 318, and connection information 319. The conflict policy 318 can be embodied in a number of ways. The conflict policy 318 may be the identity of the winner of a conflict. The conflict policy 318 may be a set of rules for determining the winning entity of the conflict. Figures 4A and 4B are flowcharts of an embodiment of an exemplary data synchronization flow. Figure 4A illustrates a flow 400-1 to prepare a synchronization procedure. At 401, virtualization of the source data is performed. At 402, virtualization of the destination data is performed. At 403, virtualization of the synchronization data is performed. At 404, a periodic synchronization task is scheduled. Prior to synchronization, the configuration data is enabled in the data virtualization layer so that synchronization in the data virtualization layer, via a data virtualization platform such as the data virtualization platform 101 of Figure 1, can be performed separately from a plurality of physical data repositories without the need for direct access to the plurality of data repositories during synchronization. Referring to Figure 1 as an example, during flow 400 data can be communicated from a destination repository 109 to the destination data visualization model 103-1, from source repositories 110 and 111 to the source data visualization model 104-1, and from the synchronization repository 112 to the synchronization data visualization model 105-1. [0060] Figure 4B illustrates an execution flow 400-2 to perform a synchronization procedure. At 405-1, an indication may be provided to execute the synchronization procedure at a specified period. 
Other triggers can be used to initiate the synchronization procedure. The execution of the synchronization procedure can begin at 405-2 in response to the occurrence of the specified period or the detection of a trigger. At 406, the configuration information is read. The configuration information read can include which sources, which destinations, all the mappings specifying which entities can synchronize with which entities, the time interval, and everything else needed to handle synchronization at the data virtualization layer. At 407, a subset of data for the destination is updated. At 408, the source is checked for new changes. At 409, pending changes for the destination are obtained. At 410, a conflict check is performed. At 411, a conflict resolution policy is applied. At 412, entities for synchronization are ordered. At 413, insertions are applied. At 414, updates are applied. At 415, deletions are applied. At 416, errors are documented. At 417, the synchronization transaction is recorded. At 418, the synchronization process is terminated. The execution flow can be executed for each pair of entities and each combination. Figure 5 is a flowchart of the features of an embodiment of an exemplary core data model. The core data model may include a change counter 500 and a change transaction 501. The change counter 500 may include a counter 502, a source ID 503, a destination ID 504, an entity name 505, an attribute name 506, one or more key attribute names 507, a change operation 508, a hash of attribute values 509, and an error message 511. The change counter 500 allows the data virtualization layer to keep track of every detail, every column, every entity, and every pair of repositories with no time limit. This is one of the items that can be stored in the metadata for synchronization. In various embodiments, none of the actual values of any of the entities that are synchronized are stored in the synchronization metadata repository. 
The only information stored in these embodiments is the metadata, which includes the change counter 500, the most common type of stored metadata. The change transaction 501 can be structured to provide an accounting of the synchronization procedure. The change transaction may include a timestamp 512, a source ID 513, a destination ID 514, a source entity name 515, a destination entity name 516, an entity count 517, an attribute count 518, an entity error count 519, an attribute error count 520, a start count 521, and an end count 522. The change transaction 501 allows a record to be kept of, for example, the time at which a given repository was synchronized with another identified repository, the total number of synchronized entities, the total number of synchronized attributes, all errors encountered, a start time, an end time, etc. Figure 6 is a flowchart of an embodiment of an exemplary data synchronization method. At 610, virtualized data, or subsets of the virtualized data, are synchronized across a plurality of data repositories. At 620, the synchronization is performed in a data virtualization platform separate from the plurality of data repositories without requiring direct access to the plurality of data repositories. 
A method 2 may comprise reading configuration data in a data virtualization platform, the configuration data being data relating to source repositories, destination repositories, and data mappings, the data virtualization platform comprising one or more servers, the data virtualization platform being operable to communicate with a user device so that the user device accesses data from the storage repositories through the data virtualization platform in the absence of direct connectivity to the storage repositories; updating a subset of source data for a destination repository, the subset of source data being from a source; checking the source for new changes since the last check of the source; identifying pending changes for the destination repository since the last synchronization of the destination repository, the pending changes being generated in one or more entities; checking the pending changes for conflicts; applying a conflict resolution policy; arranging the one or more entities in a fixed execution order before synchronizing the data; and synchronizing the data. A method 3 may include the features of method 2 and may include applying pending insertions first; applying updates after applying the pending insertions; and applying identified deletions after applying the updates. A method 4 may include the features of any of methods 2 to 3 and may include tracking and documenting errors encountered during the operations from reading the configuration data in the data virtualization platform through synchronizing the data; and recording a transaction summary of the complete synchronization process performed during data synchronization. A method 5 may include the features of any of methods 2 to 4 and may include periodically invoking the reading, updating, checking for new changes, identifying, conflict checking, applying, arranging, and synchronizing to enable a plurality of data repositories to incrementally attain identical data content across a network of connected repositories. 
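The ordering rule in methods 2 and 3 (arrange the entities in a fixed order, then apply insertions, then updates, then deletions) can be sketched as a small planning function. The tuple layout, the function name `execution_plan`, and the entity names in the usage note are hypothetical. Running deletions in reverse entity order, so that dependent entities are removed before the entities they depend on, is an assumption of this sketch and is not stated in the description.

```python
from typing import List, Sequence, Tuple

# (operation, entity name, entity key); a deliberately minimal change shape.
Change = Tuple[str, str, str]

PHASES = ("insert", "update", "delete")


def execution_plan(changes: Sequence[Change], entity_order: Sequence[str]) -> List[Change]:
    """Order pending changes: inserts first, then updates, then deletes."""
    rank = {name: i for i, name in enumerate(entity_order)}
    plan: List[Change] = []
    for phase in PHASES:
        batch = [c for c in changes if c[0] == phase]
        if phase == "delete":
            # Assumption: delete in reverse entity order (dependents first).
            batch.sort(key=lambda c: -rank[c[1]])
        else:
            batch.sort(key=lambda c: rank[c[1]])
        plan.extend(batch)  # list.sort is stable, so ties keep arrival order
    return plan
```

For example, with a hypothetical entity order of `["Well", "Wellbore"]`, a pending insertion of a `Wellbore` is planned after the insertion of its `Well`, while a deletion of a `Wellbore` is planned before the deletion of its `Well`.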
A method 6 may include the features of any of methods 2 to 5 and may include specifying a data mapping between the source and destination repositories, the data mapping comprising: a configuration schema definition that constrains the validity of the configuration data; connection data for virtualized sources and destinations; and parameters including the synchronization interval and attributes of the source and destination repositories. A method 7 may comprise the features of any of methods 2 to 6 and may include the use of a data model and schema for storing data and metadata, the data model having entities and relationships that track one or more of the following: metadata, including an incremental change tracking counter, associated with changed attributes of all entities in all repositories; metadata associated with a subset of data from a source repository to a destination repository; metadata associated with the data collected in previous synchronization cycles between the source and destination repositories; errors associated with the propagation of the actual change associated with any metadata change; or stored synchronization transaction data comprising: the date and time of the conclusion of the synchronization activity, a unique source identifier, a unique destination identifier, source entities, destination entities, source attributes, destination attributes, a count of synchronized entities, a count of synchronized attributes, a count of entities with errors during synchronization, a count of attributes with errors during synchronization, and the start and end values of the metadata counter. 
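One way to picture how an incremental change tracking counter and a stored synchronization transaction fit together is the following Python sketch. It assumes that each destination remembers the last counter value it has synchronized and the set of (entity, attribute) pairs it is interested in; the names `ChangeMeta`, `SyncTransaction`, `pending_since`, and `summarize`, and the way the counts are computed, are hypothetical and only loosely follow the fields listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class ChangeMeta:
    counter: int    # incremental change tracking counter
    entity: str
    attribute: str


@dataclass(frozen=True)
class SyncTransaction:
    finished_at: datetime
    source_id: str
    destination_id: str
    entity_count: int
    attribute_count: int
    counter_start: int
    counter_end: int


def pending_since(
    log: Sequence[ChangeMeta],
    last_counter: int,
    wanted: FrozenSet[Tuple[str, str]],
) -> List[ChangeMeta]:
    """Select only newer changes that fall in this destination's subset."""
    return [
        c for c in log
        if c.counter > last_counter and (c.entity, c.attribute) in wanted
    ]


def summarize(
    source_id: str,
    destination_id: str,
    applied: Sequence[ChangeMeta],
    counter_start: int,
    counter_end: int,
) -> SyncTransaction:
    """Build the transaction summary recorded at the end of a cycle."""
    return SyncTransaction(
        finished_at=datetime.now(timezone.utc),
        source_id=source_id,
        destination_id=destination_id,
        entity_count=len({c.entity for c in applied}),
        attribute_count=len({(c.entity, c.attribute) for c in applied}),
        counter_start=counter_start,
        counter_end=counter_end,
    )
```

Because each destination keeps its own last counter value and its own subset, the same change log can serve several destinations at once without replaying every transaction to each of them and without relying on synchronized clocks.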
A method 8 may include the features of any of methods 2 to 7 and may include checking the pending changes for conflicts, including: performing a three-parameter (three-way) comparison to detect attribute change conflicts by comparing a hash, or unique numeric code, of the source content, the hash, or unique numeric code, of the destination content, and the stored hash, or unique numeric code, of the last known synchronized content; taking into account the hierarchical relationships between the entities; and skipping the pending change if it is detected that the destination already has the same content as the source change. A method 9 may include the features of any of methods 2 to 8 and may include applying a conflict resolution policy to resolve the detected conflicts, applying the conflict resolution policy including determining a winner in the event of a conflict and cancelling or applying the pending change as derived from the determined policy. A method 10 may include the features of any of methods 2 to 9 and may include, in association with the stored change metadata, one or more of the following: tracking change metadata at the entity and attribute level to allow partial synchronization of an entity in the case where a destination is interested in only a subset of the attributes and entities; using a metadata change counter to allow incremental synchronization of only the latest changes from one source repository to multiple concurrent destinations; propagating deletions even when the source repositories do not retain, or provide, data relating to deleted information; or preventing redundant and spurious cycles of change-related updates between repositories that synchronize symmetrically. The features of any of the various methods described herein, or other combinations of features, may be combined in a procedure according to the teachings described herein. 
In various embodiments, a non-transitory computer-readable storage device may include instructions stored thereon which, when executed by a device, cause the device to perform operations, the operations comprising one or more features similar or identical to the features of the methods and techniques described herein. The physical structures of such instructions can be operated on by one or more processors. Execution of these physical structures may cause the device to perform operations to: synchronize virtualized data, or subsets of virtualized data, across a plurality of data repositories; and perform the synchronization in a data virtualization platform separate from the plurality of data repositories without the need for direct access to the plurality of data repositories. These instructions may include instructions to: read configuration data in a data virtualization platform, the configuration data being data relating to source repositories, destination repositories, and data mappings, the data virtualization platform comprising one or more servers, the data virtualization platform being operable to communicate with a user device so that the user device accesses data from the storage repositories through the data virtualization platform in the absence of direct connectivity to the storage repositories; update a subset of source data for a destination repository, the subset of source data being from a source; check the source for new changes since the last source check; identify pending changes for the destination repository since the last synchronization of the destination repository, the pending changes being generated in one or more entities; check the pending changes for conflicts; apply a conflict resolution policy; arrange the one or more entities in a fixed execution order before synchronizing the data; and synchronize the data. 
These instructions may include instructions to: apply pending insertions first; apply updates after applying the pending insertions; and apply identified deletions after applying the updates. These instructions may include instructions to: track and document errors encountered during the operations from reading the configuration data in the data virtualization platform through synchronizing the data; and record a transaction summary of the complete synchronization process performed during data synchronization. In addition, a computer-readable storage device herein is a physical device that stores data represented by physical structure within the device. Such a physical device is a non-transitory device. Examples of machine-readable storage devices may include, without limitation, read-only memory (ROM), random access memory (RAM), a magnetic disk storage device, an optical storage device, flash memory, and other electronic, magnetic, and/or optical memory devices. A system 1 may comprise a data virtualization platform comprising: one or more servers; a communication interface arranged to receive data from and transmit data to user devices; and a communication interface arranged to receive data from and transmit data to storage repositories, the data virtualization platform being structured to perform synchronization within the data virtualization platform separately from a plurality of data repositories without the need for direct access to the plurality of data repositories. 
A system 2 may comprise the structure of system 1 and may include the data virtualization platform structured to: read configuration data in the data virtualization platform, the configuration data being data relating to source repositories, destination repositories, and data mappings; update a subset of source data for a destination repository, the subset of source data being from a source; check the source for new changes since the last source check; identify pending changes for the destination repository since the last synchronization of the destination repository, the pending changes being generated in one or more entities; check the pending changes for conflicts; apply a conflict resolution policy; arrange the one or more entities in a fixed execution order before synchronizing the data; and synchronize the data. [0079] A system 3 may comprise the structure of any of systems 1 to 2 and may include the data virtualization platform structured to: apply pending insertions first; apply updates after applying the pending insertions; and apply identified deletions after applying the updates. [0080] A system 4 may comprise the structure of any of systems 1 to 3 and may include the data virtualization platform structured to: track and document errors encountered during the operations from reading the configuration data in the data virtualization platform through synchronizing the data; and record a transaction summary of the entire synchronization process performed during data synchronization. A system 5 may comprise the structure of any of systems 1 to 4 and may include the one or more servers comprising a destination data server having a destination data visualization model; a source data server having a source data visualization model; and a synchronization data server having a synchronization data visualization model. 
A system 6 may comprise the structure of any one of systems 1 to 5 and may include the data virtualization platform comprising one or more of the following: a change interface having a change counter and arranged to contain a unique repository identifier, a change operation, an entity name, an attribute name, a hash of the changed attribute value, and a change state; a change collection interface structured to add a change, iterate through the collected changes, retrieve a specific change, check whether the collected changes contain a specific change, manage a list of entity keys for the change collection, and check whether a given change conflicts with the collected changes; a synchronizer interface structured to retrieve new source changes, define a subset of data from a source to a destination, synchronize two repositories with each other, reset the change tracking metadata, describe errors encountered, and document and report transactions; a change source interface structured to retrieve the latest changes, attributes of each entity exposed by a synchronization source, a list of attribute data types for the entity attributes, key attribute types for each entity, and key attributes for each entity, and structured to delete an entity, insert an entity, update an entity, and specify a mapping to a configured destination entity; a synchronization specification interface having a source repository, a destination repository, a synchronization repository for storing the synchronization metadata, and a synchronization map between the source entities and the destination entities; a synchronization map interface having a source entity list, an identification of a destination repository, a source subset query which, when executed on a source entity, specifies a targeted subset of data for the destination repository, and a set of attribute mappings from the source entity to the destination entity; a synchronization transaction interface that provides attributes for storing synchronization transaction information including the date and time of the conclusion of the synchronization activity, a unique source identifier, a unique destination identifier, source entities, destination entities, source attributes, destination attributes, and the start and end values of a metadata counter; a synchronization state interface that reports the status of a current synchronization operation through states that cycle between the success, pending, error, manual, skipped, and in source tags; a change hash interface structured to create a unique hash or numeric code for any attribute value using an algorithm for calculating a hash value, and data structures to hold groups of hashes, comprising collections, maps, and trees; a synchronization operation interface for describing modes and kinds of changes among: no change, insertion, update, and deletion; or a synchronization exception interface that provides a synchronization error message and context associated with the synchronization error message. A system 7 may comprise the structure of any of systems 1 to 6 and may include the change operation of the change interface comprising an insertion, an update, or a deletion. 
Figure 7 is a diagram of an exemplary system 700 that can be implemented in the exemplary system architecture 10 of Figure 1. The system 700 can be implemented as a general structure of one or more components in the system architecture 10. The system 700 may be arranged to perform various operations on the data in a manner similar or identical to any of the processing techniques described in this document. The system 700 may comprise a processor 741, a memory 742, an electronic device 743, and a communication unit 745. The processor 741, the memory 742, and the communication unit 745 may be arranged to operate as a processing unit for controlling the operation of the data virtualization platform 101 or the components of the data virtualization platform 101. In various embodiments, the processor 741 can be embodied as a single processor or as a group of processors that can operate independently depending on the assigned function.
[0007] Memory 742 may be embodied as one or more databases. The communication unit 745 may handle communications between user devices and a data virtualization platform and/or between the data virtualization platform and physical data storage repositories. The communication unit 745 can use combinations of wired and wireless communication technologies. The system 700 may also include a bus 747, wherein the bus 747 provides electrical conductivity among the components of the system 700. The bus 747 may include an address bus, a data bus, and a control bus, each configured independently. The bus 747 can be embodied using a number of different communication media that allow the distribution of the components of the system 700. The bus 747 may include an instrumentality for network communication. The use of the bus 747 may be controlled by the processor 741. In various embodiments, peripheral devices 746 may include displays, additional storage memory, or other control devices that may operate in combination with the processor 741 and/or the memory 742. The peripheral devices 746 can be arranged with a display, in the form of a distributed component, which can be used with instructions stored in the memory 742 to implement a user interface 762 to manage the operation of the system 700 according to its implementation in the system architecture for data virtualization. Such a user interface 762 may operate in conjunction with the communication unit 745 and the bus 747. [0089] The structures and techniques described herein may serve as a basis for products oriented toward handling a wide variety of data management tasks, especially those that are complex. The use of a data virtualization platform is a mechanism for managing such complexity. 
The data virtualization platform can create new workflows and techniques to collaborate with opaque and hard-to-implement user tools and instruments without significant custom changes to the data repository or middleware additions. The data virtualization platform can provide efficient data integration and consistency across applications and systems, which can improve how data management tasks are enabled and managed.
[0090] While specific embodiments have been illustrated and described herein, it will be understood by those skilled in the art that any arrangement that is calculated to achieve the same purpose may be substituted for the specific embodiments described. Various embodiments use permutations and/or combinations of the embodiments described herein. It should be understood that the above description is illustrative, not restrictive, and that the phraseology and terminology used herein are for descriptive purposes.
[0008] Combinations of the foregoing embodiments, and other embodiments, will be apparent to those skilled in the art after studying the foregoing description.
Claims:
Claims (3)
[0001]
CLAIMS
1. A method characterized in that it comprises: synchronizing virtualized data, or virtualized data subsets, across a plurality of data repositories; and performing the synchronization in a data virtualization platform separate from the plurality of data repositories, without requiring direct access to the plurality of data repositories.
[0002]
2. The method of claim 1, characterized in that it comprises: reading configuration data in a data virtualization platform, the configuration data being data concerning source repositories, destination repositories and data mappings, the data virtualization platform comprising one or more servers, the data virtualization platform operating to communicate with a user device so that the user device accesses the data of the storage repositories through the data virtualization platform without direct connectivity to the storage repositories; updating a subset of original data for a destination repository, the subset of original data coming from a source; checking the source for new changes since the last check of the source; identifying pending changes for the destination repository since a last synchronization of the destination repository, the pending changes being generated in one or more entities; checking the pending changes for conflicts; applying a conflict resolution policy; arranging the one or more entities in a fixed execution order before synchronizing the data; and synchronizing the data.
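The sequence of steps recited in claim 2 can be illustrated with a minimal sketch. This is not the implementation disclosed in the patent: the `Repo` class, the `sync_once` function, the configuration keys `entity_order` and `policy`, and the entity names `parent` and `child` are all hypothetical, and conflict checking is reduced to "the destination touched the same key since the last synchronization".

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical in-memory stand-in for a virtualized repository."""
    rows: dict = field(default_factory=dict)   # (entity, id) -> value
    log: list = field(default_factory=list)    # (counter, op, key, value)
    counter: int = 0                           # incremental change counter

    def write(self, op, key, value=None):
        """Apply a change and record it in the change log."""
        self.counter += 1
        if op == "delete":
            self.rows.pop(key, None)
        else:
            self.rows[key] = value
        self.log.append((self.counter, op, key, value))

    def changes_since(self, last_counter):
        """Return the logged changes newer than last_counter."""
        return [c for c in self.log if c[0] > last_counter]


def sync_once(config, source, dest, state):
    """Run one synchronization pass; return the number of applied changes."""
    # Read configuration: which entities are mapped, and in which order.
    order = {name: i for i, name in enumerate(config["entity_order"])}
    # Check the source for new changes since the last check.
    new_changes = source.changes_since(state["source_counter"])
    # Keep only the configured subset: these are the pending changes.
    pending = [c for c in new_changes if c[2][0] in order]
    # Conflict check (assumption): destination changed the same key too.
    dest_touched = {c[2] for c in dest.changes_since(state["dest_counter"])}
    resolved = []
    for change in pending:
        conflict = change[2] in dest_touched
        # Conflict resolution policy: either side may be declared the winner.
        if conflict and config["policy"] == "destination_wins":
            continue
        resolved.append(change)
    # Arrange entities in a fixed execution order, then synchronize.
    resolved.sort(key=lambda c: (order[c[2][0]], c[0]))
    for _, op, key, value in resolved:
        dest.write(op, key, value)
    state["source_counter"] = source.counter
    state["dest_counter"] = dest.counter
    return len(resolved)


if __name__ == "__main__":
    src, dst = Repo(), Repo()
    cfg = {"entity_order": ["parent", "child"], "policy": "source_wins"}
    st = {"source_counter": 0, "dest_counter": 0}
    src.write("insert", ("child", 1), "c-1")
    src.write("insert", ("parent", 7), "p-7")
    print(sync_once(cfg, src, dst, st), dst.rows)
```

In this sketch the user-facing code only ever talks to `Repo` objects held by the platform, which mirrors the claim's requirement that synchronization happens without direct access to the underlying repositories.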
[0003]
3. The method of claim 2, characterized in that the method comprises: first applying pending insertions; applying updates after applying the pending insertions; and applying identified deletions after applying the updates that follow the first application of the pending insertions.
4. The method of claim 2, characterized in that the method comprises: tracking and documenting errors during the operations from reading the configuration data in the data virtualization platform to synchronizing the data; and recording a transaction summary of a complete synchronization process that performs the data synchronization.
5. The method of claim 2, characterized in that the method comprises periodically invoking the reading, updating, checking for new changes, identifying, conflict checking, applying, arranging and synchronizing, to enable a plurality of data repositories to incrementally obtain identical data content across a connected network of repositories.
6. The method of claim 2, characterized in that the method comprises specifying a data mapping between source and destination repositories, the data mapping comprising: a definition of a configuration schema that imposes a validity constraint on the configuration data; connection data for the virtualized sources and destinations; parameters including the synchronization interval; and the attributes of the source and destination repositories.
7. The method of claim 2, characterized in that the method comprises using a data model and schema for storing data and metadata, the data model including quantities and relationships that track one or more of the following: metadata, including an incremental change tracking counter, associated with the changed attributes of all entities in all repositories; metadata associated with a subset of data from a source repository to a destination repository; metadata associated with data collected during previous synchronization cycles between the source and destination repositories; an error associated with the propagation of the actual change associated with any change metadata; or stored synchronization transaction data including: the date and time of the conclusion of the synchronization activity, the unique source identifier, the unique destination identifier, the source entities, the destination entities, the source attributes, the destination attributes, the count of synchronized entities, the count of synchronized attributes, the count of entities with errors during synchronization, the count of attributes with errors during synchronization, and the start and end values of the metadata counter.
8. The method of claim 2, characterized in that the conflict checking for pending changes includes: checking a three-parameter match to detect attribute change conflicts by comparing a hash, or unique numeric code, of the source content, the hash, or unique numeric code, of the destination content, and the stored hash, or unique numeric code, of the last known synchronized content; taking into account the hierarchical relations between the entities; and skipping a pending change if it is detected that the destination already has the same content as the source change.
9. The method of claim 2, characterized in that applying a conflict resolution policy resolves the detected conflicts, applying the conflict resolution policy including determining a winner in the event of a conflict and cancelling or applying the pending change as determined by the policy.
10. The method of claim 2, characterized in that, in association with the stored change metadata, the method comprises one or more of the following: tracking change metadata at the entity and attribute level, allowing partial synchronization of an entity where a destination is interested in only a subset of the attributes and entities; using a metadata change counter to enable incremental synchronization of only the latest changes from a source repository to multiple simultaneous destinations; propagating deletion information when the source repositories do not retain, or provide, data about the deleted information; or preventing redundant and spurious cycles of change-related updates in repositories that synchronize symmetrically.
11. A system characterized in that it comprises: a data virtualization platform including: one or more servers; a communication interface arranged to receive data from and transmit data to user instruments; and a communication interface arranged to receive data from and transmit data to storage repositories, the data virtualization platform being structured to perform synchronization within the data virtualization platform, separate from the plurality of data repositories, without requiring direct access to the plurality of data repositories.
12. The system of claim 11, characterized in that the data virtualization platform is structured to: read configuration data in the data virtualization platform, the configuration data being data concerning source repositories, destination repositories and data mappings; update a subset of original data for a destination repository, the subset of original data coming from a source; check the source for new changes since the last check of the source; identify pending changes for the destination repository since a last synchronization of the destination repository, the pending changes being generated in one or more entities; check the pending changes for conflicts; apply a conflict resolution policy; arrange the one or more entities in a fixed execution order before synchronizing the data; and synchronize the data.
13. The system of claim 12, characterized in that the data virtualization platform is structured to: first apply pending insertions; apply updates after applying the pending insertions; and apply identified deletions after applying the updates that follow the first application of the pending insertions.
14. The system of claim 12, characterized in that the data virtualization platform is structured to: track and document errors during the operations from reading the configuration data in the data virtualization platform to synchronizing the data; and record a transaction summary of a complete synchronization process that performs the data synchronization.
15. The system of claim 12, characterized in that the one or more servers comprise: a destination data server having a destination data visualization model; a source data server having a source data visualization model; and a data synchronization server having a synchronization data visualization model.
16. The system of claim 12, characterized in that the data virtualization platform comprises one or more of the following: a change interface having a change counter and arranged to maintain a unique repository identifier, a change operation, an entity name, an attribute name, a hash of the changed attribute value, and a change state; a change collection interface structured to add a change, iterate through collected changes, retrieve a specific change, check whether the collected changes contain a specific change, manage a list of entity keys for the change collection interface, and check whether a given change conflicts with the collected changes; a synchronizer interface structured to retrieve new source changes, define subsets of data from a source to a destination, synchronize two repositories relative to each other, reset the change tracking metadata, describe the errors encountered, and document and report transactions; a change source interface structured to retrieve the latest changes, the attributes of each entity exposed by a source for synchronization, a list of attribute data types for the entity attributes, key attribute types for each entity and key attributes for each entity, and structured to delete an entity, insert an entity, update an entity and specify a mapping to a configured destination entity; a synchronization specification interface having a source repository, a destination repository, a synchronization repository for storing the synchronization metadata, and a synchronization map between the source entities and the destination entities; a synchronization map interface having a source entity list, an identification of a destination repository, a source subset query that, when executed on a source entity, specifies a subset of the data destined for the destination repository, and a set of attribute mappings from the source entity to the destination entity; a synchronization transaction interface that provides the attributes for storing information of the synchronization transaction, including the date and time of the conclusion of the synchronization activity, a unique source identifier, a unique destination identifier, source entities, destination entities, destination attributes, and the start and end values of a metadata counter; a synchronization status interface that provides the status of a current synchronization operation via states among success, pending, error, manual, skipped and source tags; a change hash interface structured to provide a unique hash, or numeric code, of an attribute value using an algorithm used to calculate a hash value, and data structures to maintain groups of hashes, including collections, maps and trees; a synchronization operation interface for describing change modes among: no change, insertion, update or deletion; or a synchronization exception interface that provides an error message and a synchronization context associated with the synchronization error message.
17. The system of claim 16, characterized in that the change operation of the change interface comprises an insertion, an update or a deletion.
18. A non-transitory storage device readable by a device, characterized in that it comprises instructions stored thereon which, when executed by a device, cause the device to perform operations to: synchronize virtualized data, or virtualized data subsets, across a plurality of data repositories; and perform the synchronization in a data virtualization platform separate from the plurality of data repositories without requiring direct access to the plurality of data repositories.
19. The non-transitory storage device readable by a device of claim 18, characterized in that the instructions include instructions to: read configuration data in a data virtualization platform, the configuration data being data concerning source repositories, destination repositories and data mappings, the data virtualization platform comprising one or more servers, the data virtualization platform operating to communicate with a user device so that the user device accesses the data of the storage repositories through the data virtualization platform without direct connectivity to the storage repositories; update a subset of original data destined for a destination repository, the subset of original data coming from a source; check the source for new changes since the last check of the source; identify pending changes for the destination repository since a last synchronization of the destination repository, the pending changes being generated in one or more entities; check the pending changes for conflicts; apply a conflict resolution policy; arrange the one or more entities in a fixed execution order prior to data synchronization; and synchronize the data.
20. The non-transitory storage device readable by a device of claim 19, characterized in that the instructions include instructions to: first apply pending insertions; apply updates after applying the pending insertions; and apply identified deletions after applying the updates that follow the first application of the pending insertions.
21. The non-transitory storage device readable by a device of claim 19, characterized in that the instructions include instructions to: track and document errors during the operations from reading the configuration data in the data virtualization platform to synchronizing the data; and record a transaction summary of a complete synchronization process that performs the data synchronization.
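Two of the claimed mechanisms lend themselves to a short illustration: the comparison of source, destination and last-synchronized hashes used for conflict checking, and the application of insertions before updates before deletions. The sketch below is only one possible reading, not the patented implementation. The names `content_hash`, `three_way_check`, `order_changes` and `Verdict` are hypothetical, SHA-256 is an assumed choice of hash algorithm, and the handling of a destination-only change is an assumption that the claims do not spell out.

```python
import hashlib
from enum import Enum


class Verdict(Enum):
    SKIP = "skip"          # nothing to propagate for this attribute
    APPLY = "apply"        # only the source changed since the last sync
    CONFLICT = "conflict"  # both sides changed, and to different content


def content_hash(value):
    """Return a stable hash of an attribute value (SHA-256 is an assumption)."""
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()


def three_way_check(source_hash, dest_hash, last_synced_hash):
    """Compare source, destination and last-synchronized hashes."""
    if source_hash == dest_hash:
        # Destination already has the same content as the source change.
        return Verdict.SKIP
    if dest_hash == last_synced_hash:
        # Destination is untouched since the last sync: safe to apply.
        return Verdict.APPLY
    if source_hash == last_synced_hash:
        # Assumption: only the destination changed, so the source has
        # nothing pending for this attribute.
        return Verdict.SKIP
    # Both sides diverged from the last synchronized content.
    return Verdict.CONFLICT


# Fixed order of operations: insertions, then updates, then deletions.
OP_ORDER = {"insert": 0, "update": 1, "delete": 2}


def order_changes(changes):
    """Sort changes by operation; sorted() is stable, so ties keep their order."""
    return sorted(changes, key=lambda change: OP_ORDER[change["op"]])


if __name__ == "__main__":
    last = content_hash("old")
    print(three_way_check(content_hash("new"), content_hash("old"), last))
    print(three_way_check(content_hash("new"), content_hash("other"), last))
    print(order_changes([{"op": "delete"}, {"op": "insert"}]))
```

A `CONFLICT` verdict would then be handed to whatever conflict resolution policy is configured, which decides the winner and either cancels or applies the pending change.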
Similar technologies:
Publication number | Publication date | Patent title
FR3031604A1|2016-07-15|APPARATUS AND METHODS FOR SYNCHRONIZATION OF DATA
US10990590B2|2021-04-27|Aggregation framework system architecture and method
Carpenter et al.2020|Cassandra: the definitive guide: distributed data at web scale
CN107122443B|2019-09-17|A kind of distributed full-text search system and method based on Spark SQL
US9262462B2|2016-02-16|Aggregation framework system architecture and method
CN110300963A|2019-10-01|Data management system in large-scale data repository
KR102307371B1|2021-10-05|Data replication and data failover within the database system
US10055410B1|2018-08-21|Corpus-scoped annotation and analysis
US10467250B2|2019-11-05|Data model design collaboration using semantically correct collaborative objects
US10445321B2|2019-10-15|Multi-tenant distribution of graph database caches
US8489547B2|2013-07-16|System and method for transforming configuration data items in a configuration management database
US10761908B2|2020-09-01|Distillation of various application interface data structures distributed over distinctive repositories to form a data source of consolidated application interface data components
US10540383B2|2020-01-21|Automatic ontology generation
CN109074387A|2018-12-21|Versioned hierarchical data structure in Distributed Storage area
US10901973B1|2021-01-26|Methods and apparatus for a semantic multi-database data lake
US20170116303A1|2017-04-27|Unified data model
Newman et al.2008|A scale-out RDF molecule store for distributed processing of biomedical data
Saxena et al.2018|Concepts of HBase archetypes in big data engineering
US20190361999A1|2019-11-28|Data analysis over the combination of relational and big data
Li2016|Introduction to Big Data
CN109150964A|2019-01-04|A kind of transportable data managing method and services migrating method
Cosulschi et al.2013|Implementing bfs-based traversals of rdf graphs over mapreduce efficiently
US20210200732A1|2021-07-01|Tree-like metadata structure for composite datasets
US20210365451A1|2021-11-25|Query content-based data generation
Paneva-Marinova et al.2019|Intelligent Data Curation in Virtual Museum for Ancient History and Civilization
Patent family:
Publication number | Publication date
GB201710262D0|2017-08-09|
AR102833A1|2017-03-29|
NO20171080A1|2017-06-30|
AU2015375497A1|2017-07-13|
US20170308602A1|2017-10-26|
GB2550502A|2017-11-22|
CA2972382A1|2016-07-14|
GB2550502B|2021-11-10|
WO2016111697A1|2016-07-14|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title
US20140025646A1|2011-03-28|2014-01-23|Telefonaktiebolaget L M Ericsson |Data management in a data virtualization environment|
US8131739B2|2003-08-21|2012-03-06|Microsoft Corporation|Systems and methods for interfacing application programs with an item-based storage platform|
US7779218B2|2005-07-22|2010-08-17|Hewlett-Packard Development Company, L.P.|Data synchronization management|
US7653668B1|2005-11-23|2010-01-26|Symantec Operating Corporation|Fault tolerant multi-stage data replication with relaxed coherency guarantees|
US8655850B2|2005-12-19|2014-02-18|Commvault Systems, Inc.|Systems and methods for resynchronizing information|
US7539827B2|2006-07-19|2009-05-26|Microsoft Corporation|Synchronization of change-tracked data store with data store having limited or no change tracking|
US7979662B2|2007-12-28|2011-07-12|Sandisk Il Ltd.|Storage device with transaction indexing capability|
US8706690B2|2008-05-12|2014-04-22|Blackberry Limited|Systems and methods for space management in file systems|
US9411864B2|2008-08-26|2016-08-09|Zeewise, Inc.|Systems and methods for collection and consolidation of heterogeneous remote business data using dynamic data handling|
US8229936B2|2009-10-27|2012-07-24|International Business Machines Corporation|Content storage mapping method and system|
US8380661B2|2010-10-05|2013-02-19|Accenture Global Services Limited|Data migration using communications and collaboration platform|
KR101697979B1|2010-11-23|2017-01-19|삼성전자주식회사|Method and apparatus for syncronizing data in connected devices|
US8688635B2|2011-07-01|2014-04-01|International Business Machines Corporation|Data set connection manager having a plurality of data sets to represent one data set|
WO2013019869A2|2011-08-01|2013-02-07|Actifio, Inc.|Data fingerpringting for copy accuracy assurance|
US9338757B2|2011-10-03|2016-05-10|Texas Instruments Incorporated|Clock synchronization and centralized guard time provisioning|
GB2505881A|2012-09-12|2014-03-19|Ibm|Determining common table definitions in distributed databases|
US10701149B2|2012-12-13|2020-06-30|Level 3 Communications, Llc|Content delivery framework having origin services|
US10599671B2|2013-01-17|2020-03-24|Box, Inc.|Conflict resolution, retry condition management, and handling of problem files for the synchronization client to a cloud-based platform|
US20140279899A1|2013-03-15|2014-09-18|Unisys Corporation|Data bus architecture for inter-database data distribution|
US10678663B1|2015-03-30|2020-06-09|EMC IP Holding Company LLC|Synchronizing storage devices outside of disabled write windows|
US10846115B1|2015-08-10|2020-11-24|Amazon Technologies, Inc.|Techniques for managing virtual instance data in multitenant environments|
US10970311B2|2015-12-07|2021-04-06|International Business Machines Corporation|Scalable snapshot isolation on non-transactional NoSQL|
US10692015B2|2016-07-15|2020-06-23|Io-Tahoe Llc|Primary key-foreign key relationship determination through machine learning|
US10536476B2|2016-07-21|2020-01-14|Sap Se|Realtime triggering framework|
US10482241B2|2016-08-24|2019-11-19|Sap Se|Visualization of data distributed in multiple dimensions|
US10542016B2|2016-08-31|2020-01-21|Sap Se|Location enrichment in enterprise threat detection|
GB201615745D0|2016-09-15|2016-11-02|Gb Gas Holdings Ltd|System for analysing data relationships to support query execution|
GB201615747D0|2016-09-15|2016-11-02|Gb Gas Holdings Ltd|System for data management in a large scale data repository|
US10630705B2|2016-09-23|2020-04-21|Sap Se|Real-time push API for log events in enterprise threat detection|
US10673879B2|2016-09-23|2020-06-02|Sap Se|Snapshot of a forensic investigation for enterprise threat detection|
US10534908B2|2016-12-06|2020-01-14|Sap Se|Alerts based on entities in security information and event management products|
US10530792B2|2016-12-15|2020-01-07|Sap Se|Using frequency analysis in enterprise threat detection to detect intrusions in a computer system|
US10534907B2|2016-12-15|2020-01-14|Sap Se|Providing semantic connectivity between a java application server and enterprise threat detection system using a J2EE data|
US10552605B2|2016-12-16|2020-02-04|Sap Se|Anomaly detection in enterprise threat detection|
US20180176234A1|2016-12-16|2018-06-21|Sap Se|Bi-directional content replication logic for enterprise threat detection|
US10764306B2|2016-12-19|2020-09-01|Sap Se|Distributing cloud-computing platform content to enterprise threat detection systems|
US10389594B2|2017-03-16|2019-08-20|Cisco Technology, Inc.|Assuring policy impact before application of policy on current flowing traffic|
US10530794B2|2017-06-30|2020-01-07|Sap Se|Pattern creation in enterprise threat detection|
WO2019084781A1|2017-10-31|2019-05-09|EMC IP Holding Company LLC|Management of data using templates|
CN107958023A|2017-11-06|2018-04-24|北京华宇信息技术有限公司|Method of data synchronization, data synchronization unit and computer-readable recording medium|
US10986111B2|2017-12-19|2021-04-20|Sap Se|Displaying a series of events along a time axis in enterprise threat detection|
US10681064B2|2017-12-19|2020-06-09|Sap Se|Analysis of complex relationships among information technology security-relevant entities using a network graph|
US10866963B2|2017-12-28|2020-12-15|Dropbox, Inc.|File system authentication|
US11086901B2|2018-01-31|2021-08-10|EMC IP Holding Company LLC|Method and system for efficient data replication in big data environment|
US10754737B2|2018-06-12|2020-08-25|Dell Products, L.P.|Boot assist metadata tables for persistent memory device updates during a hardware fault|
CN110958287A|2018-09-27|2020-04-03|阿里巴巴集团控股有限公司|Operation object data synchronization method, device and system|
US10942904B2|2018-10-09|2021-03-09|Arm Limited|Mapping first identifier to second identifier|
US11204940B2|2018-11-16|2021-12-21|International Business Machines Corporation|Data replication conflict processing after structural changes to a database|
US11138061B2|2019-02-28|2021-10-05|Netapp Inc.|Method and apparatus to neutralize replication error and retain primary and secondary synchronization during synchronous replication|
Legal status:
2016-10-21| PLFP| Fee payment|Year of fee payment: 2 |
2017-10-26| PLFP| Fee payment|Year of fee payment: 3 |
2018-01-05| PLSC| Publication of the preliminary search report|Effective date: 20180105 |
2018-09-28| PLFP| Fee payment|Year of fee payment: 4 |
2019-11-29| PLFP| Fee payment|Year of fee payment: 5 |
2021-08-06| ST| Notification of lapse|Effective date: 20210705 |
Priority:
Application number | Filing date | Patent title
PCT/US2015/010803|WO2016111697A1|2015-01-09|2015-01-09|Apparatus and methods of data synchronization|